Automatic speech summarization based on word significance and linguistic likelihood
Abstract
This paper proposes a new method of automatically summarizing speech by extracting a limited number of relatively important words from its automatic transcription according to a target compression ratio for the number of characters. To determine the set of words to be extracted, we define a summarization score consisting of a topic score (significance measure) of words and a linguistic score (likelihood) of the word concatenation. A set of words maximizing the score is efficiently selected using a dynamic programming (DP) technique. Japanese broadcast news speech transcribed using a large vocabulary continuous speech recognition system was summarized. As a result, 86% of the important words in the original speech were correctly included in the summarizing sentences, and 72% of the summarizing sentences maintained the meaning of the original speech under the 60-70% summarization condition.
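The word selection described in the abstract can be illustrated with a small dynamic programming sketch. This is not the authors' implementation: the function name `summarize`, the `weight` parameter, the callable `linguistic` (standing in for a bigram log-likelihood), and the example scores are all assumptions made for illustration. For simplicity the sketch fixes the number of words to keep, whereas the paper states its compression ratio in characters.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def summarize(
    words: Sequence[str],
    significance: Sequence[float],
    linguistic: Callable[[Optional[str], str], float],
    num_keep: int,
    weight: float = 1.0,
) -> Tuple[List[str], float]:
    """Pick `num_keep` words, in original order, maximizing
    sum(significance) + weight * sum(linguistic(prev, cur)).

    Hypothetical sketch: `linguistic(prev, cur)` plays the role of a
    bigram log-likelihood; `prev` is None for the first selected word.
    """
    n = len(words)
    if not 0 < num_keep <= n:
        raise ValueError("num_keep must be between 1 and len(words)")

    neg_inf = float("-inf")
    # best[m][i]: best score of a summary with m + 1 words ending at word i
    best = [[neg_inf] * n for _ in range(num_keep)]
    back = [[-1] * n for _ in range(num_keep)]

    for i in range(n):
        best[0][i] = significance[i] + weight * linguistic(None, words[i])

    for m in range(1, num_keep):
        for i in range(m, n):
            for j in range(m - 1, i):
                score = (
                    best[m - 1][j]
                    + significance[i]
                    + weight * linguistic(words[j], words[i])
                )
                if score > best[m][i]:
                    best[m][i] = score
                    back[m][i] = j

    # Backtrack from the best final word.
    last = num_keep - 1
    idx = max(range(last, n), key=lambda i: best[last][i])
    total = best[last][idx]
    chosen: List[int] = []
    for m in range(last, -1, -1):
        chosen.append(idx)
        idx = back[m][idx]
    chosen.reverse()
    return [words[i] for i in chosen], total


# Illustrative inputs only; the scores are made up.
example_words = ["the", "prime", "minister", "visited", "tokyo", "yesterday"]
example_scores = [0.1, 2.0, 2.5, 1.5, 3.0, 0.5]
summary, total = summarize(example_words, example_scores, lambda p, c: 0.0, 3)
```

The table `best[m][i]` has one entry per (summary length, last word) pair, so the search costs O(num_keep · n²) rather than enumerating every subset of words. Raising `weight` makes the selection favor word sequences the language model considers likely over individually significant words.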
Similar resources
Speech Summarization: An Approach through Word Extraction and a Method for Evaluation
In this paper, we propose a new method of automatic speech summarization for each utterance, where a set of words that maximizes a summarization score is extracted from automatic speech transcriptions. The summarization score indicates the appropriateness of summarized sentences. This extraction is achieved by using a dynamic programming technique according to a target summarization ratio. This...
Advances in automatic speech summarization
This paper reports recent advances in automatic speech summarization methods. In our proposed method, a set of words maximizing a summarization score is extracted from automatically transcribed speech. This extraction is performed according to a target compression ratio using a dynamic programming technique. The extracted set of words is then connected to build a summarized sentence. The summari...
A Study on Statistical Methods for Automatic Speech Summarization
This dissertation proposes a new automatic speech summarization method through word extraction. In this method, a set of words maximizing a summarization score indicating an appropriateness of summarization is extracted from automatically transcribed speech. This extraction is performed according to a target compression ratio using a dynamic programming technique sentence by sentence. The extra...
Automatic speech summarization based on sentence extraction and compaction
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted based on the amount of information and the confidence measures of constituent words, and the set of extracted sentences is compressed by our sentence compaction method. The sentence compaction is performed by selec...
Two-stage Automatic Speech Summarization by Sentence Extraction and Compaction
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted from the results of large-vocabulary continuous speech recognition (LVCSR) based on the amount of information and the confidence measures of constituent words. The set of extracted sentences is compressed by our se...